Fix/Make proc_macro span_close() and span_open() more accurate after set_span() calls #149052

Keith-Cancel · 2025-11-18T07:26:55Z

When Group::set_span() is called the span's for Group::span_open() and Group::span_close() methods just become the whole span instead where it should be the span of the delimiters. This PR ensures they have at least a more usable value after a call to Group::set_span().

Here is a kinda of motivating example of this issue I observed. If you compare the current behavior the outputs the diagnostics will not be on the opening and closing delimiters after calling set span. If you look at the outputs after using this PR they will match. Further, afterward Span::eq() then evaluates to true which nice to see.

#![feature(proc_macro_diagnostic)]
#![feature(proc_macro_span)]
extern crate proc_macro;

use proc_macro::TokenStream;
use proc_macro::TokenTree;

#[proc_macro_attribute]
pub fn foo(args: TokenStream, _: TokenStream) -> TokenStream {
    let mut i = args.into_iter();
    let mut grp = match i.next().expect("iterator empty") {
    TokenTree::Group(g) => g,
        _ => panic!("Expected Group")
    };

    let o_open = grp.span_open();
    let o_close = grp.span_close();
    o_open.note("Old Open Span").emit();
    o_close.note("Old Close Span").emit();

    // Set the span with the same span
    grp.set_span(grp.span());

    // Notice where the span these diagnostics are compared to the first.
    let n_open = grp.span_open();
    let n_close = grp.span_close();
    n_open.note("New Open Span").emit();
    n_close.note("New Close Span").emit();


    grp.span().note(&format!(
        "old_open == new_open: {}\nold_close == new_close: {}",
        o_open.eq(&n_open),
        o_close.eq(&n_close)
    )).emit();

    return TokenStream::new();
}

This is all I did to invoke the proc_macro example above.

#[macro_test::foo({ bar })]
struct Foo;

rustbot · 2025-11-18T07:26:59Z

r? @petrochenkov

rustbot has assigned @petrochenkov.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

petrochenkov · 2025-11-18T10:10:28Z

DelimSpan::from_single in the compiler also currently doesn't try to update the opening and closing spans in a smart way, just sets the whole span.

I think it would be better to test the opening/closing span recovery strategy on the compiler first, and then do the same for proc macros.
There are multiple alternatives, and in some of them the choices here are obvious, but in some not and need to be checked experimentally (what diagnostic changes they cause).

The delimiter is invisible
- It's not clear what is better to use, the whole span, or empty spans at the end and beginning, need to check experimentally
The delimiter is visible and the source map actually contains the matching symbols at the beginning and end of the span
- The obvious choice - use spans of those symbols
The delimiter is visible, but the source map doesn't contain the matching symbols neither at the beginning nor at the end
- It's not clear what is better to use, the whole span, or empty spans at the end and beginning, need to check
The delimiter is visible, but the source map only contains one matching symbols either at the beginning or at the end, the weird case
- It's still not clear what is better to use for both the matching delimiter (its symbol's span, whole span, or empty span) and the non-matching delimiter (whole span or empty span), need to check

Keith-Cancel · 2025-11-18T11:49:42Z

DelimSpan::from_single in the compiler also currently doesn't try to update the opening and closing spans in a smart way, just sets the whole span.

I think it would be better to test the opening/closing span recovery strategy on the compiler first, and then do the same for proc macros. There are multiple alternatives, and in some of them the choices here are obvious, but in some not and need to be checked experimentally (what diagnostic changes they cause).
* The delimiter is invisible
  
  * It's not clear what is better to use, the whole span, or empty spans at the end and beginning, need to check experimentally

* The delimiter is visible and the source map actually contains the matching symbols at the beginning and end of the span
  
  * The obvious choice - use spans of those symbols

* The delimiter is visible, but the source map doesn't contain the matching symbols neither at the beginning nor at the end
  
  * It's not clear what is better to use, the whole span, or empty spans at the end and beginning, need to check

* The delimiter is visible, but the source map only contains one matching symbols either at the beginning or at the end, the weird case
  
  * It's still not clear what is better to use for both the matching delimiter (its symbol's span, whole span, or empty span) and the non-matching delimiter (whole span or empty span), need to check

I can look at adding something to the compiler side of things first. I suppose we could inspect parts of the stream itself if that is desired. However, at least on the proc_macro side I did not do that. That's because someone could assign a span not directly associated to the underlying stream that the group contains. Therefore, inspecting the stream may not provide any useful information. Thusly, I chose what I did since it would more accurate in most cases than just using the whole span.

From a quick look the DelimSpan I see popup on the compiler side only stores the open and close spans so I would need to investigate a little more. I will say at least most cases that I have seen any Group tokens coming from the compiler have seem to have the correct open and close for anything it parsed/lexed.

As for the four options:

None/Invisible Group
- Open and close clearly indicate we should be talking about the start and end of that group of tokens. The fact we have a Group means its has at least logically/structurally delimited. However there are no Glyphs or Symbols to point to directly. So the span in my opinion should be zero width here.
Visible and matches up
- Like you said this one is obvious
Delimiter is set to a visible variant, but there are no delimiters in the token stream.
- If we are wanting to inspect the token stream. In this case we have the logical/structural delimitation provided by the Group. However, there is no symbol/glyph that logically make sense to point out. So I guess like the first case a Zero width start and end. Just marking the where the group is considered opened and closed would be the most logical choice.
Only one Visible delimiter
- One span would have width, the other would have zero width matching the reasoning I gave above. In the case we at least have one symbol/glyph we can point to. The other-side we should just mark where it starts/ends.

As for the diagnostic output is there anything you're particularly looking for or wanting to avoid when you mention experimentally testing?

petrochenkov · 2025-11-18T11:59:10Z

I suppose we could inspect parts of the stream itself if that is desired.

... but there are no delimiters in the token stream

I was not suggesting to inspect token stream, "source map" is some source text to which all spans map, not tokens.

petrochenkov · 2025-11-18T12:00:00Z

As for the diagnostic output is there anything you're particularly looking for or wanting to avoid when you mention experimentally testing?

Just want to look at the results with different options for a start.

rustbot · 2025-11-19T04:28:24Z

This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed.

Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers.

Keith-Cancel · 2025-11-19T05:10:01Z

As for the diagnostic output is there anything you're particularly looking for or wanting to avoid when you mention experimentally testing?

Just want to look at the results with different options for a start.

So I have here the from_single in with the default being an empty close and open. It causes UI tests to fail. Looking at some of the diffs I would says some are an improvement, but others I would call a regression. If you set the default_open and default_close to be the whole span as before the UI tests pass.

Honestly, in the compiler I would say that it might better to have not just some catch all from_single method, but maybe a couple different ones where the caller has more context, and make a better choice on which one to call when constructing a DelimSpan from a single span.

This still does nothing about my main concern though about fixing the proco_macro::Group in the proc_macro lib loosing the open and closing delimiter spans after a set_span which is my main concern with the initial pr.

petrochenkov · 2025-11-19T12:20:27Z

Could you update the tests (to make CI green) and submit the compiler changes in a separate PR?
@rustbot author

…r set_span call. This tries to recover the open in and close delimiter spans for the new span, otherwise it reverts to old behavior of using the the whole span for the open and close delimiter.

Keith-Cancel · 2025-11-25T21:17:29Z

Okay, I updated the this to only handle the obvious case. I also make sure the the current delimiter of the group matches if not it just goes to the old behavior.

I also thought about maybe adding this, but was not sure we wanted to loop through the source text like this? My reasoning was a span in theory could be for a fragment of text like this { ... } { ... }, and we probably want the default in case like this.

        let mut depth = 1usize;
        let mut iter = src.iter().skip(1);
        while depth > 0
            && let Some(&c) = iter.next()
        {
            if c == open_chr {
                depth += 1;
            }
            if c == close_chr {
                depth -= 1;
            }
        }
        // We possibly don't have a true matching pair because the depth did not
        // reach zero or the iterator did not finish. This could be a false
        // positive because of comments and literals. However, we are falling
        // back to the default.
        if depth != 0 || iter.next().is_some() {
            self.0.span = bridge::DelimSpan::from_single(span.0);
            return;
        }

The false positive I am talking about is something like this:

{ /* comment }  */ let _code = 0; }

So it would revert to old behavior, however in cases where there is no delimiter character in like a string literal or comment it still would allow proc_macros to emit better diagnostics after setting the span.

@rustbot ready
r? @petrochenkov
@rustbot label T-libs

-- Edit --
Also there a way to clean up some of the rustbot noise in a PR. I made a mistake when trying to update to the main origin and it kinda made a mess of things.

rustbot · 2025-11-25T21:21:11Z

Requested reviewer is already assigned to this pull request.

Please choose another assignee.

Veykril · 2025-11-26T10:36:48Z

library/proc_macro/src/lib.rs

+
+        // No source text to check or source text is too small.
+        // just go back to default behavior.
+        let source = match span.0.source_text() {


This seems like quite the expensive operation to call here. For one this goes through the bridge (which is expensive for rust-analyzer, especially this call, as once we implement it will always do an RPC roundtrip), but that aside, this also always allocates a new string.

The allocation and the further bridge calls to subspan can be avoided by moving this whole logic into the compiler's side of the bridge.

An alternative solution is to just add fn set_span_(open,close) to the API.
Typically the user of set_span knows whether the passed span actually contains the group's source text, and if it's known, then set_span_(open,close) can be called as well.
I don't suspect it happens often, I'd expect set_span to be set to something semi-related from the macro input.

If we do this, then #149229 probably shouldn't be merged.

If we do this, then #149229 probably shouldn't be merged.

Or it can be merged if set_span_open changes only the opening span, set_span_close changes only the closing span, and set_span changes both to the same value.

This would also solve the question of what goes back to the compiler when proc_macro::Span is converted to compiler::Span.

If we do this, then #149229 probably shouldn't be merged.

Or it can be merged if set_span_open changes only the opening span, set_span_close changes only the closing span, and set_span changes both to the same value.

This would also solve the question of what goes back to the compiler when proc_macro::Span is converted to compiler::Span.

A set_span_open() and set_span_close(), only effecting their respective fields and span set_span() overwriting both seems fine. Also if we want to add those API or similar API, #149229 makes even more sense because the entire field becomes more awkward with those proposed API.

Although for a public API we probably want that API to be set_span_open_close(Span, Span) to at least to make sure the spans are from the same file.

The allocation and the further bridge calls to subspan can be avoided by moving this whole logic into the compiler's side of the bridge.

That is a good point and certainly do able.

to at least to make sure the spans are from the same file

That also requires going through the bridge though.

petrochenkov · 2025-11-26T11:01:51Z

My reasoning was a span in theory could be for a fragment of text like this { ... } { ... }

All of this is a heuristic, using the full span is also not a precise solution, so it's fine, I think.

Keith-Cancel · 2025-11-26T18:15:40Z

My reasoning was a span in theory could be for a fragment of text like this { ... } { ... }

All of this is a heuristic, using the full span is also not a precise solution, so it's fine, I think.

I also realized after going to bed last night such a case is probably very unlikely under typical usage. This is because the compiler would have broken those up into separate group tokens.

rustbot assigned petrochenkov Nov 18, 2025

rustbot added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Nov 18, 2025

This comment has been minimized.

Sign in to view

Keith-Cancel marked this pull request as draft November 18, 2025 07:38

rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Nov 18, 2025

Keith-Cancel marked this pull request as ready for review November 18, 2025 07:42

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Nov 18, 2025

This comment has been minimized.

Sign in to view

petrochenkov added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Nov 18, 2025

This comment has been minimized.

Sign in to view

rustbot added the has-merge-commits PR has merge commits, merge with caution. label Nov 19, 2025

Keith-Cancel force-pushed the main branch from 4781cdf to 82082f1 Compare November 19, 2025 04:58

This comment has been minimized.

Sign in to view

petrochenkov added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Nov 19, 2025

rustbot removed the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Nov 19, 2025

Make Group::span_open() and Group::span_close()` more accurate afte…

d255ff6

…r set_span call. This tries to recover the open in and close delimiter spans for the new span, otherwise it reverts to old behavior of using the the whole span for the open and close delimiter.

rustbot added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Nov 25, 2025

Keith-Cancel marked this pull request as ready for review November 25, 2025 21:22

rustbot added the T-libs Relevant to the library team, which will review and decide on the PR/issue. label Nov 25, 2025

Veykril reviewed Nov 26, 2025

View reviewed changes

petrochenkov mentioned this pull request Nov 26, 2025

Make proc_macro::Group take up less space remove redundant field. #149229

Open

petrochenkov added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Nov 26, 2025

Fix/Make proc_macro span_close() and span_open() more accurate after set_span() calls #149052

Are you sure you want to change the base?

Fix/Make proc_macro span_close() and span_open() more accurate after set_span() calls #149052

Conversation

Keith-Cancel commented Nov 18, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

rustbot commented Nov 18, 2025

Uh oh!

This comment has been minimized.

This comment has been minimized.

petrochenkov commented Nov 18, 2025

Uh oh!

Keith-Cancel commented Nov 18, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

petrochenkov commented Nov 18, 2025

Uh oh!

petrochenkov commented Nov 18, 2025

Uh oh!

rustbot commented Nov 19, 2025

Uh oh!

This comment has been minimized.

Keith-Cancel commented Nov 19, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

This comment has been minimized.

This comment has been minimized.

petrochenkov commented Nov 19, 2025

Uh oh!

Keith-Cancel commented Nov 25, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

rustbot commented Nov 25, 2025

Uh oh!

Veykril Nov 26, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

petrochenkov Nov 26, 2025

Choose a reason for hiding this comment

Uh oh!

petrochenkov Nov 26, 2025

Choose a reason for hiding this comment

Uh oh!

petrochenkov Nov 26, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Keith-Cancel Nov 26, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Keith-Cancel Nov 26, 2025

Choose a reason for hiding this comment

Uh oh!

petrochenkov Nov 27, 2025

Choose a reason for hiding this comment

Uh oh!

petrochenkov commented Nov 26, 2025

Uh oh!

Keith-Cancel commented Nov 26, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

Keith-Cancel commented Nov 18, 2025 •

edited

Loading

Keith-Cancel commented Nov 18, 2025 •

edited

Loading

Keith-Cancel commented Nov 19, 2025 •

edited

Loading

Keith-Cancel commented Nov 25, 2025 •

edited

Loading

Veykril Nov 26, 2025 •

edited

Loading

petrochenkov Nov 26, 2025 •

edited

Loading

Keith-Cancel Nov 26, 2025 •

edited

Loading